perm filename AGENDA[DIS,DBL]5 blob
sn#224591 filedate 1976-07-10 generic text, type C, neo UTF8
COMMENT ⊗ VALID 00007 PAGES
C REC PAGE DESCRIPTION
C00001 00001
C00002 00002 .NSECP(Control Structure)
C00005 00003 .SSEC(AM's Search)
C00015 00004 .SSEC(Constraining AM's Search)
C00021 00005 .SSEC(The Agenda)
C00022 00006 . SSSEC(Why an Agenda?)
C00027 00007 . SSSEC(Details of the Agenda scheme)
C00039 ENDMK
C⊗;
.NSECP(Control Structure)
.BEGIN TURN ON "{}"
AM is one of those awkward programs whose representations only make
sense if you already understand how they will be operated on. A
discussion of AM's control structure (this chapter and the next) must
thus precede a discussion of concepts and how they are represented
(Chapter {[2] KNOWL}). Section {[2]EXAM1}.{[2]REPRSUM}
gave the reader a
sufficient knowledge of AM's "anatomy" to follow these chapters. Thus
armed with a cursory knowledge of the "statics" of AM, we shall
proceed to describe in detail its "dynamics".
Section {SECNUM}.1 will give the reader a feeling for the immensity
of AM's search space. This is the "problem". Section 2 will give the
top-level "solution": the flow of control is governed by a job-list,
an agenda of plausible tasks. Section {SECNUM}.3 will present some
details of this global control scheme.
Chapter {[2] HEURS} deals with the way AM's heuristics operate; this
could be viewed as the "low-level" or ⊗4local⊗* control structure of
AM. Chapter {[2] KNOWL} contains some detailed information about the
actual concepts (and heuristics) AM starts with, and a little more
about their design and representation.
The reader is also directed to
Appendix {[2] EXAM2}, which presents
several detailed examples of AM "in action".
.END
.AGENDA: SECNUM ;
.SSEC(AM's Search)
Let's first spend a paragraph reviewing how concepts are stored. AM
contains a collection of data structures, called ⊗4concepts⊗*. Each
concept is meant to coincide intuitively with one mathematical idea
(e.g., Sets, Union, Trichotomy). As such, a concept has several
aspects or parts, called ⊗4facets⊗* (e.g., Examples, Definitions,
Domain/range, Worth). If you wish to think of a concept as a
"frame", then its facets are "slots" to be filled in. Each facet of
a concept will either be totally blank, or else will contain a bunch
of ⊗4entries⊗*. For example, the Algorithms facet of the concept
Union may point to several equivalent LISP functions, each of which
can be used to form the union of two sets$$ The reasons for having
multiple algorithms is that sometimes AM will want one that is fast,
sometimes AM will be more concerned with economizing on storage,
sometimes AM will want to "analyze" an algorithm, and for that
purpose it must be a very un-optimized function, etc. $. Even the
"heuristic rules" are merely entries on the appropriate kind of facet
(e.g., the entries on the Interest facet of the Structure concept are
rules for judging the interestingness of Structures$$ A typical such
rule is: "A structure is very interesting if all its elements are
mildly interesting in precisely the same way." $).
At any moment, AM contains a couple hundred concepts, each of which
has only ⊗4some⊗* of its facets filled in. AM starts with 115
concepts, and grows to about 300 concepts before running out of
time/space. Most facets of most concepts are totally blank. AM's
basic activity is to select some facet of some concept, and then try
to fill in some entries for that slot$$ This is not quite complete.
In addition to filling in entries for a given facet/concept pair, AM
may wish to check it, split it up, reorganize it, etc. $. Thus the
primitive kind of "⊗4task⊗*" for AM is to deal with a particular
facet/concept pair. A typical task looks like this:
.B816
Check the entries on the "Domain/range" facet of the "Bag-Insert"
concept
.E
If the average concept has ten or twenty blank facets, and there are
a couple hundred concepts, then clearly there will be about
20x200=4000 "fill-in" type tasks for AM to work on, at any given
moment. If several hundred facets have recently been filled in,
there will be that many "check-entries" type tasks available. It
turns out that executing a task will take around 10 cpu seconds, so
only a small percentage of these tasks can ever be executed.$$ Given
only a few hours of cpu time, on a PDP-10 KI-10, running compiled
Interlisp. $
Since most of these tasks will never be explored, what will make AM
appear smart -- or stupid -- are its choices of which task to pick at
each moment. So it's worth AM's spending a nontrivial amount of time
deciding which task to execute next. On the other hand, it had better
not be ⊗4too⊗* much time, since a task does take only 10 seconds.
One question that must be answered is: What percentage of AM's legal
moves (at any typical moment) would be considered intelligent
choices, and what percentage would be irrational? The answer comes
from empirical results. The percentages vary wildly depending on the
previous few tasks. Sometimes, AM will be obviously "in the middle"
of a sequence of tasks, and only one or two of the legal tasks would
seem plausible. Other times, AM has just completed an investigation
by running into dead-ends, and there may be hundreds of tasks it
could choose and not be criticized. The median case would perhaps
permit about 6 of the legal tasks to be judged reasonable.
It is important for AM to locate one of these currently-plausible
tasks, but it's not worth spending much time deciding which of
⊗4them⊗* to work on next. AM still faces a huge search: find one of
the 6 winners out of a few thousand candidates.
Its choice of tasks is made even more important due to the 10-second
"cycle time" -- the time to investigate/execute one task. A human
user is watching, and ten seconds is a nontrivial amount of time to
him. He can therefore observe, perceive, and analyze each and every
task that AM selects. Even just a few bizarre choices will greatly
lower his opinion of AM's intelligence. The trace of AM's actions is
what counts, not its final results. So AM can't draw much of its
apparent intelligence from the ⊗4speed⊗* of the computer.
Chess-playing programs have had to face the dilemma of the trade-off
between "intelligence" (foresight, inference, processing,...) and
total number of board situations examined. In chess, the
characteristics of current-day machines have to date unfortunately
still favored fast, nearly-blind$$ i.e., using a very simple static
evaluation function. $ search. Although machine speed may allow
blind search to win over symbolic inference for ⊗4shallow⊗* searches,
it can't provide any more than a constant speed-up factor for an
exponential search.
Since the number of "legal moves" for AM at any moment is in the
thousands, it is unrealistic to consider "systematically"$$ e.g.,
exhaustively, or using αααβ minimaxing, etc. $ walking through the
entire space that AM can reach. In AM's problem domain, there is so
much "freedom" that symbolic inference finally ⊗4can⊗* win over the
"simple but fast" exploration strategy$$ This is the author's
opinion, partially supported by the results of AM. Paul Cohen
disagrees, feeling that machine speed should be the key to an
automated mathematician's success. $.
.SSEC(Constraining AM's Search)
A great deal of heuristic knowledge is required to
constrain the necessary processing effectively, to zero in on a good task
to tackle
next. This is done in two stages.
.BN
λλ A list of plausible facet/concept pairs is maintained. Nothing can
get onto this list unless there is some reason why filling in (or checking) that
facet of that concept would be worthwhile.
λλ All the plausible tasks on this "job list" are ranked by the
number and strength of the different reasons supporting them. Thus
the facet/concept pairs near the top of the list will all be very
promising tasks to work on.
.E
The first of these constraints is akin to replacing a ⊗4legal⊗* move
generator by a ⊗4plausible⊗* move generator. The second kind of
constraint is akin to using a heuristic evaluation function to select
the best move from among the plausible ones.
The job-list or ⊗4agenda⊗* is a data structure which is a natural way
to store the results of these procedures. It is (1) a list of all
the plausible tasks which have been generated, and (2) it is kept
ordered by the numeric estimate of how worthwhile each task is. A
typical entry on the agenda might look like this:
.WBOX(10,22) ; TABS 12,50 TURN ON "\"
MBOX Fill in the EXAMPLES facet of the PRIMES concept $
MBOX ⊗8~⊗* $
MBOX ⊗8~⊗* $
MBOX ⊗8~⊗* ⊗4Reasons for filling in this facet⊗* $
MBOX ⊗8~⊗* $
MBOX ⊗8↓⊗* $
MBOX \⊗8⊂∞α→\⊃ $
MBOX \⊗8~⊗6 1. No examples of primes are known so far. \⊗8~⊗* $
MBOX \⊗8~⊗6 2. Focus of attention: AM just defined Primes. \⊗8~⊗* $
⊗8~\%∞α→\$∞ →~
MBOX ⊗8~⊗* $
MBOX ⊗8~⊗* $
MBOX ⊗8~⊗* ⊗4Overall value of these reasons⊗* $
MBOX ⊗8~⊗* $
MBOX ⊗8↓⊗* $
MBOX ⊗2250⊗* $
.EBOX
.COMMENT The funny line above is due to the fact that the MBOX separator is $ ;
The actual top-level control structure is simply to pluck the top
task from the agenda and execute it. That is, select the
facet/concept pair having the best supporting reasons, and try to
fill in that facet of that concept.
While a task is being executed, some new tasks might get proposed and
merged into the agenda. Also, some new concepts might get created,
and this, too, would generate a flurry of new tasks.
After AM stops filling in entries for the facet specified in the
chosen task, it removes that task from the agenda, and moves on
whichever task is the highest-rated at that time.
.ONCE TURN ON "{}"
The reader probably has a dozen good$$ A question is "good" if (i) I
have anticipated it, and (ii) It has a snazzy answer. A typical "bad"
question will begin "Yes, but why didn't you...".
As M. Schlick points out [ref], the real killer is "SO WHAT?"
$ questions in mind
at this point (e.g., How do the reasons get rated?, How do the tasks
get proposed?, What happens after a task is selected?,...). The next
section should answer most of these. Some more judgemental ones (How
dare you propose a numeric calculus of plausible reasoning?! If you
slightly de-tune all those numbers, does the system's performance
fall apart?...) will be answered in Chapter {[2] EVALU}.
.SSEC(The Agenda)
. SSSEC(Why an Agenda?)
This subsection provides motivation for the following one, by arguing
that a job-list scheme is a natural mechanism to use to manage the
task-selection problem AM faces. Recall that AM must zero in on one
of the best few tasks to perform next, and it repeatedly makes this
choice. At each moment, there might be thousands of directions to
explore (plausible tasks to consider).
If all the legal tasks were written out, and reasons were thought up
to support each one, then perhaps we could order them by the strength
of those reasons, and thereby settle on the "best" task to work on
next. In order to appear "smart" to the human user, AM should
⊗4never⊗* execute a task having no reasons attached.
.ONCE TURN ON "{}"
Some magical function will be assumed to exist, which provides a
numeric rating, a priority value, for any given task. The function
looks at a given facet/concept pair, examines all the associated
reasons supporting that task, and computes an estimate of how worthwhile it would be
for AM to spend some time now working on that facet of that concept.
So AM will maintain a list of those legal tasks which have some good
reasons tacked onto them, which justify why each task should be
executed, why it is plausible. At least implicitly, AM has a numeric
rating for each task. The obvious control algorithm is to choose the
task with the highest rating, and work on that one next.
Assuming the tasks on this list are kept ordered by this numeric
rating, then AM can just repeatedly pluck the highest task and
execute it. While it's executing, some new tasks might get proposed
and added to the list of tasks. Reasons are kept tacked onto each
task on this list, and form the basis for the numeric priority
rating.
Give or take a few features, this notion of a "job-list" is the one
which AM uses. It is also called an ⊗4agenda⊗*.$$ Borrowed from
Kaplan's term for the job-list present in KRL. [Bobrow & Winograd]
For an earlier general discussion of agendas, see [Knuth, v1]. $
"A task on the agenda" is the same as "a job on the job-list" is the
same as "a facet/concept pair which has been proposed" is the same as
"an active node in the search space". Henceforth, I'll use the
following all interchangably: task, facet/concept pair, node, job.
This should break up the monotony$$ and cover my sloppiness.
Seriously, thanks to English, each of these terms will conjure up a
slightly different image: a "job" is something to do, a "node" is an
item in a search space, "facet/concept pair" reminds you of the
format of a task. $.
. SSSEC(Details of the Agenda scheme)
.AGENDASEC: SECNUM
.AGENDASSEC: SSECNUM
.AGENDAPAGE:
At each moment, AM has many plausible tasks (hundreds or even thousands) which
have been suggested for some good reason or other, but haven't been
carried out yet. Each task is at the level of working on a certain
facet of a certain concept: filling it in, checking it, etc. Recall
that each task also has tacked onto it a list of symbolic reasons
explaining why the task is worth doing.
.XGENLINES←XGENLINES-1
.ONCE TURN ON "{}"
In addition, a number (between 0 and 1000) is attached ⊗4to each
reason⊗*, representing some absolute measure of the value of that
reason (at the moment).
One global formula$$ Here is that formula:
{TURN ON "↑↓" }
Worth(J) = ||SQRT(SUM R↓i↑2)|| x α[ 0.2xWorth(A) +
0.3xWorth(F) + 0.5xWorth(C)α], where J = job to be judged = (Act A,
Facet F, Concept C), and α{R↓iα} are the ratings of the reasons
supporting J. For the sample job pictured in the box, A=Fillin,
F=Examples, C=Sets, α{R↓iα}=α{100,100,200α}. The formula will be repeated --
and explained -- in Section {[2]HEURS}.{[2]FORMULASSEC}, on
page
{[2]FORMULA}. $ combines all the reasons' values into a single priority
value for the task as a whole.
This overall rating is taken to indicate how worthwhile it would be for
AM to bother executing that task, how interesting the task
would probably turn out to be.
The "intelligence" of AM's selection of task is thus seen
to depend on this one formula.
Yet experiments
show (see page {[3] EXPTPAGE})
that its precise form is not important. We conclude
that the "intelligence" has been pushed down into the careful
assigning of reasons (and ⊗4their⊗* values) for each proposed task.
.COMMENT Beware of the braces in the last para.;
.XGENLINES←XGENLINES-1
A typical entry on the agenda might look like this:
.WBOX(9,11)
MBOX TASK: Fill-in examples of Sets $
MBOX PRIORITY: 300 $
MBOX REASONS: $
MBOX 100: No known examples for Sets so far. $
MBOX 100: Failed to fillin examples of Set-union, for lack of examples of Sets $
MBOX 200: Focus of attention: AM recently worked on the concept of Set-union $
.EBOX
Notice the similarity of this to the initial few lines which AM types
just after it selects a job to work on.
The flow of control is simple:
AM picks the task
with the highest priority value, and tries to execute it.
As a side effect, new jobs occasionally get added to the agenda
while the task is being executed.
The global priority value of the
task also indicates how much time and space this task deserves. The
sample task above might rate 20 cpu seconds, and 200 list cells. When
either of these resources is used up, AM terminates work on the
task, and proceeds to pick a new one.
These two limits will be referred to in the sequel as "⊗4time/space quanta⊗*"
which are allocated to the chosen task.
Whenever several techniques exist for satisfying some task, the remaining
time/space quanta are divided evenly among those alternatives; i.e., each
method is tried for a small time. This parceling out of time quanta is
related to Carl Hewitt's notion of "activation energy" [Plasma ref...].
In the case of filling in
examples of sets, the space quantum (200 cells) will be used up quickly
(long before the 20 seconds expire).
There are two big questions now:
.BN
λλ Exactly how is a task proposed and ranked?
.INDENT 10,0,0
How is a plausible new task first formulated?
How do the supporting reasons for the task get assigned?
How does each reason get assigned an absolute numeric rating?
Does a task's priority value change? When and how?
.ONCE INDENT 4,0,0
λλ How does AM execute a task, once it's chosen?
Exactly what can be done during a task's execution?
.END
.ONCE TURN ON "{}α"
The next chapter will deal with both of these questions.
A detailed discussion of difficulties and
limitations of these ideas can be found in Section
{[2]DIFSECNUM}.{[1]DIFSSECNUM}, on page {[3]DIFSEC}.